A Discretization Method Based on Maximizing the Area under Receiver Operating Characteristic Curve

Authors

  • Murat Kurtcephe
  • H. Altay Güvenir
Abstract

Many machine learning algorithms require categorical features, so all numeric-valued data must first be discretized into intervals. In this paper, we present a new discretization method based on the area under the receiver operating characteristic (ROC) curve (AUC) measure. Maximum area under ROC curve-based discretization (MAD) is a global, static, and supervised discretization method. MAD uses the sorted order of a feature's continuous values and discretizes the feature so that the AUC based on that feature is maximized. The proposed method is compared with alternative discretization methods such as ChiMerge, Entropy-Minimum Description Length Principle (MDLP), Fixed Frequency Discretization (FFD), and Proportional Discretization (PD). FFD and PD were proposed recently and are designed for Naïve Bayes learning. Like MAD, ChiMerge is a merging discretization method. Evaluations are performed in terms of M-Measure, an AUC-based metric for multi-class classification, and accuracy values obtained from the Naïve Bayes and Aggregating One-Dependence Estimators (AODE) algorithms on real-world datasets. Empirical results show that MAD is a strong alternative to other discretization methods.
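To make "the AUC based on that feature" concrete, the following is a minimal Python sketch, not the authors' MAD algorithm (the abstract does not specify its steps). It assumes a two-class problem with labels 0 and 1, computes the AUC of a single feature in its pairwise (Mann-Whitney) form with ties counted as one half, and shows how a choice of cut points changes that AUC. The function names `auc_single_feature` and `discretize`, the toy data, and the cut points are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def auc_single_feature(values, labels):
    """AUC obtained by ranking instances on one feature alone.

    Pairwise (Mann-Whitney) form: the fraction of positive/negative pairs
    in which the positive instance has the larger value; ties count 1/2.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one instance of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def discretize(values, cut_points):
    """Replace each value by the index of its interval (cut_points sorted)."""
    return [bisect_right(cut_points, v) for v in values]


# Toy data (illustrative only, not from the paper).
values = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
labels = [0, 0, 1, 1, 0, 1]

raw_auc = auc_single_feature(values, labels)

# Hypothetical cut points that place 0.35 and 0.4 in the same interval.
binned = discretize(values, [0.2, 0.5, 0.75])
binned_auc = auc_single_feature(binned, labels)

print(f"AUC on raw values:    {raw_auc:.4f}")
print(f"AUC on binned values: {binned_auc:.4f}")
```

On this toy data the binned feature scores a higher AUC than the raw one, because the interval boundaries merge a locally mis-ordered positive/negative pair into a tie. A search over cut points with AUC as its objective, as the abstract describes for MAD, would prefer such boundaries.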


Similar resources

Technical Report No: BU-CE-1001 A Discretization Method based on Maximizing the Area Under ROC Curve

We present a new discretization method based on the area under the ROC curve (AUC) measure. Maximum Area under ROC Curve Based Discretization (MAD) is a global, static, and supervised discretization method. It discretizes a continuous feature so that the AUC based only on that feature is maximized. The proposed method is compared with alternative discretization methods such as Entropy-MDLP (...


A Learning Method of Directly Optimizing Classifier Performance at Local Operating Range (Lae-Jeong Park and Jung-Ho Moon)

This paper addresses an effective learning method that enables us to directly optimize neural network classifier's discrimination performance at a desired local operating range by maximizing a partial area under a receiver operating characteristic (ROC) or domain-specific curve, which is difficult to achieve with classification accuracy or mean squared error (MSE)-based learning methods. The ef...


Risk Estimation by Maximizing the Area under ROC Curve

Risks exist in many different domains; medical diagnoses, financial markets, fraud detection and insurance policies are some examples. Various risk measures and risk estimation systems have hitherto been proposed and this paper suggests a new risk estimation method. Risk estimation by maximizing the area under a receiver operating characteristics (ROC) curve (REMARC) defines risk estimation as ...


Application of adjusted-receiver operating characteristic curve analysis in combination of biomarkers for early detection of gestational diabetes mellitus

Introduction: In the medical diagnostic field, evaluating the diagnostic accuracy of biomarkers or tests has always been a matter of concern. In some situations, one biomarker alone may not be sufficiently sensitive and specific for prediction of a disease. However, combining multiple biomarkers may lead to better diagnosis. The aim of this study was to assess the performance of combination of bio...


Receiver Operating Characteristic (ROC) Curve Analysis for Medical Diagnostic Test Evaluation

This review provides the basic principle and rationale for ROC analysis of rating and continuous diagnostic test results versus a gold standard. Derived indexes of accuracy, in particular the area under the curve (AUC), have a meaningful interpretation for distinguishing diseased from healthy subjects. The methods of estimate of AUC and its testing in single diagnostic test and also comparative studies...



Journal:
  • IJPRAI

Volume 27

Publication date: 2013